{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluating Robust Models\n", "\n", "The goal of this notebook is to show how to compare several methods across several datasets.This will also serve as inroduction to two important `scikit-clean` functions: `load_data` and `compare`. \n", "\n", "We'll (roughly) implement the core idea of 3 existing papers on robust classification in the presence of label noise, and see how they compare on our 4 datasets readily available in `scikit-clean`. Those papers are:\n", "\n", "1. Forest-type Regression with General Losses and Robust Forest - ICML'17 (`RobustForest` below in `MODELS` dictionary)\n", "2. An Ensemble Generation Method Based on Instance Hardness- IJCNN'18 (`EGIH`)\n", "3. Classification with label noise- a Markov chain sampling framework - ECML-PKDD'18 (`MCS`)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "import numpy as np\n", "import pandas as pd\n", "from sklearn.tree import DecisionTreeClassifier\n", "from sklearn.neighbors import KNeighborsClassifier\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.svm import SVC\n", "from sklearn.neural_network import MLPClassifier\n", "from sklearn.model_selection import StratifiedKFold\n", "from sklearn.metrics import accuracy_score, make_scorer\n", "\n", "from skclean.detectors import KDN, InstanceHardness, MCS\n", "from skclean.handlers import WeightedBagging, SampleWeight, Filter\n", "from skclean.models import RobustForest\n", "from skclean.pipeline import Pipeline, make_pipeline\n", "from skclean.utils import load_data, compare\n", "\n", "import seaborn as sns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll use 4 datasets here, all come preloaded with `scikit-clean`. If you want to load new datasets through this function, put the csv-formatted dataset file in `datasets` folder (use `os.path.dirname(skclean.datasets.__file__)` to get it's location). Make sure labels are at the last column, and features are all real numbers. Check source code of `load_data` for more details." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "DATASETS = ['iris', 'breast_cancer', 'optdigits', 'spambase']\n", "SEED = 42 # For reproducibility\n", "N_JOBS = 8 # No of cpu cores to use in parallel\n", "CV = StratifiedKFold(n_splits=5, shuffle=True, random_state=SEED+1) \n", "SCORING = 'accuracy'" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "MODELS = {\n", " 'RobustForest': RobustForest(n_estimators=100),\n", " 'EGIH':make_pipeline(KDN(), WeightedBagging()),\n", " 'MCS': make_pipeline(MCS(), SampleWeight(LogisticRegression()))\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll create 30% uniform label noise for all our datasets using `UniformNoise`. Note that we're treating noise simulation as data transformation step and attaching it before our models in a pipeline. In this way, noise will only impact training set, and testing will be performed on clean labels." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "from skclean.simulate_noise import UniformNoise\n", "\n", "N_MODELS = {}\n", "for name, clf in MODELS.items():\n", " N_MODELS[name] = make_pipeline(UniformNoise(.3), clf)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`scikit-clean` models are compatible with `scikit-learn` API. So for evaluation, we'll use `cross_val_score` function of scikit-learn- this will create multiple train/test according to the `CV` variable we defined at the beginning, and compute performance. It also allows easily parallelizing the code using `n_jobs`." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "iris, (150, 4), 3, 1.000\n", "\n", "iris, RobustForest: 0.8067 in 0.94 secs\n", "iris, EGIH: 0.9133 in 0.73 secs\n", "iris, MCS: 0.7067 in 0.11 secs\n", "\n", "breast_cancer, (569, 30), 2, 0.594\n", "\n", "breast_cancer, RobustForest: 0.8664 in 0.22 secs\n", "breast_cancer, EGIH: 0.8981 in 0.82 secs\n", "breast_cancer, MCS: 0.9367 in 0.11 secs\n", "\n", "optdigits, (5620, 64), 10, 0.969\n", "\n", "optdigits, RobustForest: 0.9402 in 1.48 secs\n", "optdigits, EGIH: 0.9649 in 8.03 secs\n", "optdigits, MCS: 0.9584 in 6.36 secs\n", "\n", "spambase, (4601, 57), 2, 0.650\n", "\n", "spambase, RobustForest: 0.7857 in 1.12 secs\n", "spambase, EGIH: 0.8581 in 7.18 secs\n", "spambase, MCS: 0.8303 in 0.45 secs\n", "\n" ] } ], "source": [ "from time import perf_counter # Wall time\n", "from sklearn.model_selection import cross_val_score\n", "\n", "for data_name in DATASETS:\n", " X,y = load_data(data_name, stats=True) \n", " \n", " for clf_name, clf in N_MODELS.items():\n", " start_at = perf_counter()\n", " r = cross_val_score(clf, X, y, cv=CV, n_jobs=N_JOBS, scoring=SCORING).mean()\n", " print(f\"{data_name}, {clf_name}: {r:.4f} in {perf_counter()-start_at:.2f} secs\")\n", " print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `compare` function does basically the same thing the above cell does. Plus, it stores the results in a CSV format, with datasets in rows and algorithms in columns. And it can also automatically resume after interruption." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wall time: 24.5 s\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
RobustForestEGIHMCS
iris0.8066670.8666670.726667
breast_cancer0.848750.8788080.938488
optdigits0.9414590.9585410.959253
spambase0.7767930.8593820.817
\n", "
" ], "text/plain": [ " RobustForest EGIH MCS\n", "iris 0.806667 0.866667 0.726667\n", "breast_cancer 0.84875 0.878808 0.938488\n", "optdigits 0.941459 0.958541 0.959253\n", "spambase 0.776793 0.859382 0.817" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "result_path = \"noisy.csv\"\n", "\n", "dfn = compare(N_MODELS, DATASETS, cv=CV, df_path=result_path, random_state=SEED,\n", " scoring=SCORING,n_jobs=N_JOBS, verbose=False)\n", "dfn" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's compare above values with ones computed with clean labels:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
RobustForestEGIHMCS
iris0.9533330.9466670.806667
breast_cancer0.9578330.9402730.950846
optdigits0.9786480.9637010.965125
spambase0.9495770.9387070.850471
\n", "
" ], "text/plain": [ " RobustForest EGIH MCS\n", "iris 0.953333 0.946667 0.806667\n", "breast_cancer 0.957833 0.940273 0.950846\n", "optdigits 0.978648 0.963701 0.965125\n", "spambase 0.949577 0.938707 0.850471" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dfc = compare(MODELS, DATASETS, cv=CV, df_path=None, random_state=SEED,\n", " scoring=SCORING,n_jobs=N_JOBS, verbose=False)\n", "dfc" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "dfc = dfc.assign(label='clean')\n", "dfn = dfn.assign(label='noisy')\n", "df = pd.concat([dfc,dfn]).melt(id_vars='label')\n", "df.rename(columns={'variable':'classifier','value':SCORING},inplace=True)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEGCAYAAAB/+QKOAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAgaUlEQVR4nO3de3xV5Z3v8c+PcEkUEAVhOgFMMIziBVBy1DPaqsVLqrYeRxzxYEGtMlhNaas9etSj2Far1WkLsS3l1aOApVpvVJ0i1qoVHbUIytXiGBE0gVIMcpNwSfjNH3sl7iQrZCXZKzvZ+/t+vXix17Oe9azfzmX/sp61nucxd0dERKSxbukOQEREOiclCBERCaUEISIioZQgREQklBKEiIiE6p7uAFJpwIABXlBQkO4wRES6jKVLl37i7oeH7cuoBFFQUMCSJUvSHYaISJdhZuub26cuJhERCaUEISIioZQgREQklBKEiIiEUoIQEZFQShAiIhJKCUJEREJl1DiIzqqsrIzy8vIGZZWVlQDk5+c3qV9UVERpaWmHxCYi0hwliDSprq5OdwgiIgekBNEBwq4Gpk6dCsD06dM7OhwRkUh0D0JEREIpQYiISCglCBERCaUEISIioZQgREQklBKEiIiEUoIQEZFQShAiIhJKA+VEJGNpmpv2UYJIobAfxubU1asbUd0S/eCmR3Pf0+Y+ZPR96vw0zU10ShApVF5ezrJVf6X2oMNarNttrwOwdO2mFuvm7NrS7tgkteL6kKmqquLOO+/kjjvuoH///rGcI5tompv2UYJIsdqDDqP66PNS2mbemgUpbU+ia+5qIK4PmTlz5rBy5Urmzp3Ld77znZS2LdJaShAinURVVRULFy7E3Vm4cCETJ07UVUQnlS1dj3qKSaSTmDNnDvv37wegtraWuXPnpjkiaa3q6uqMusehKwiRTuJPf/oTNTU1ANTU1PDCCy+om6mT6uiux3RRgkihyspKcnZtS/k9g5xdVVRW1qS0Tel8zjrrLBYsWEBNTQ3du3fn7LPPTndIkuWUIEQ6iUmTJrFw4UIAcnJymDhxYpoj6jr0iHk8lCBSKD8/n7/t6R7LU0z5+YNS2qZ0Pv3796ekpIRnn32WkpIS3aBuBT1iHo9YE4SZlQDTgRzg1+5+T6P9hwIPAkcCu4Gr3H1VsG8dsAOoBWrcvTjOWFMlZ9eWSF1M3XZvB2B/bt9IbYISRDaYNGkS69at09VDG+gR89SLLUGYWQ7wc+BsoAJ4y8yecfd3k6rdAixz94vM7Oig/tik/We6+ydxxZhqRUVFkeuWl+9IHDMsygf/oFa1LW3TGbop+vfvz4wZMyK1KRK3OK8gTgLK3X0tgJk9ClwIJCeIY4AfAbj7GjMrMLNB7t7ytV8n1NwHQGs+eCC7+zzTSd0UIg3FmSDygY+TtiuAkxvVWQ78C/CamZ0EHAEMBjYBDvzRzBz4lbvPCjuJmU0GJgMMHTo0pW8gTnl5eekOQUKom0Lkc3EmCAsp80bb9wDTzWwZsBJ4B6h7nvNUd99gZgOBF8xsjbsvatJgInHMAiguLm7cfqegqwER6YriTBAVwJCk7cHAhuQK7r4duBLAzAz4MPiHu28I/v+7mc0n0WXVJEGIiEg84pxq4y1guJkVmllPYDzwTHIFM+sX7AO4Gljk7tvN7GAz6xPUORg4B1gVY6wiItJIbFcQ7l5jZtcDz5N4zPVBd19tZlOC/TOBEcBcM6slcfP6G8Hhg4D5iYsKugO/dfeFccUqAhoJL+EqKysjP62WaYPwYh0H4e4LgAWNymYmvX4DGB5y3FpgVJyxZYJsmVFSJJ2qq6t5f/U7DO1d22LdnvsSnTJ71i9pse5HO3PaHVvcNJI6A2XSbJIdSSPhpTlDe9dyy4nbU9rm3W+3PEg23ZQgurBsmVGyI2kkvMjnlCBEAmGj1SsrK0OvyKprdwOQt7/h09x5eXlNuvY0El66KiUIkUDYFZnu80g2U4IQOQB92Es205KjIiISSglCRERCKUGIiEgoJQgREQmlBCEiIqGUIEREJJQShIiIhFKCEBGRUEoQIiISSglCRERCKUGIiEgoJQgREQmlBCEiIqGUIEREJJQShIiIhFKCEBGRUEoQIiISSglCRERCKUGIiEgoJQgREQmlBCEiIqGUIEREJJQShIiIhFKCEBGRUEoQIiISKtYEYWYlZvaemZWb2c0h+w81s/lmtsLMFpvZcVGPFRGReHWPq2EzywF+DpwNVABvmdkz7v5uUrVbgGXufpGZHR3UHxvxWBERACorK8nZtY28NQtS2m7Orir2mEOPlDbbZcR5BXESUO7ua919L/AocGGjOscALwK4+xqgwMwGRTxWRERiFNsVBJAPfJy0XQGc3KjOcuBfgNfM7CTgCGBwxGMBMLPJwGSAoUOHpiRwkVQqKyujvLy8SXllZSXV1dWR28nLyyM/P79BWVFREaWlpe2OsavLz8/nb3u6U330eSltN2/NAnrv3wFE/z5lkjgThIWUeaPte4DpZrYMWAm8A9REPDZR6D4LmAVQXFwcWkckncrLy3l/9TsM7V3boLx2Vzf214b9qIer3bedPTUb67c/2pmTshhFwsSZICqAIUnbg4ENyRXcfTtwJYCZGfBh8O+glo4V6UqG9q7llhO3p7TNu9/um9L2RBqL8x7EW8BwMys0s57AeOCZ5Apm1i/YB3A1sChIGi0eKyIi8YrtCsLda8zseuB5IAd40N1Xm9mUYP9MYAQw18xqgXeBbxzo2LhiFRGRpuLsYsLdFwALGpXNTHr9BjA86rEiItJxNJJaRERCKUGIiEgoJQgREQmlBCEiIqGUIEREJJQShIiIhFKCEBGRUEoQIiISSglCRERCKUGIiEioWKfaEBHp6vbs2cP63Tkpnz13/Y4cDq6sTGmbqRbpCsLMnjSz881MVxwiIlki6hXEL0ms2zDDzB4HZgdLhIqIZLRevXoxpEd1LOt59Gq0QmBnE+mKwN3/5O4TgBOBdcALZva6mV1pZlm6nLeISGaLfA/CzPoDlwNfJ7E06DzgNGAScEYcwYlkgsrKSj7bkZ192NK1RUoQZvYUcDTwMPBVd69bGPd3ZrYkruDkc80tfB+mrt7UqVNbrKtF70WkOVGvIB5w95fCdrh7cQrjkWY0t/B9mJ77Ej2He9YfOHdr0fuOkZ+fz56ajVnZhy1dW9QEMcLM3nb3rQBmdihwmbv/IrbIpIlUL3yvRe9F5ECiPrZ6TV1yAHD3T4FrYolIREQ6hagJopuZWd2GmeUAPeMJSUREOoOoXUzPA4+Z2UzAgSnAwtiiEhGRtIuaIG4C/g24FjDgj8Cv4wpKJNN8tDPaY66bdiUu6gcdtD9Sm8PbHZlI8yIlCHffT2I09S/jDUck8xQVFUWuuzd4RLnXES0fM7yVbYu0VtRxEMOBHwHHALl15e4+LKa4RDJGa8aZ1I1dmT59elzhiEQW9Sb1QySuHmqAM4G5JAbNiYhIhop6DyLP3V80M3P39cA0M3sVuCPG2EQyWtjo+AONgteod+loURPE7mCq7/fN7HqgEhgYX1gi2SkvLy/dIYjUi5ogvg0cBHwL+AGJbqZJMcUkkhV0NSCdXYsJIhgU96/u/j1gJ4l1IUREJMO1eJPa3WuBMckjqUVEJPNF7WJ6B3g6WE3us7pCd3/qQAeZWQkwHcgBfu3u9zTafwjwG2BoEMv97v5QsG8dsAOoBWo0a6yISMeKmiAOA6qALyeVOdBsggi6pn4OnA1UAG+Z2TPu/m5SteuAd939q2Z2OPCemc1z973B/jPd/ZOIMYqISApFHUndlvsOJwHl7r4WwMweBS4EkhOEA32C7qvewBYSYy1ERCTNoo6kfojEh3kD7n7VAQ7LBz5O2q4ATm5U5wHgGWAD0Ae4NJjWg+B8fzQzB37l7rOaiW0yMBlg6NChLb8ZERGJJGoX038kvc4FLiLxoX4gYTe1GyeZc4FlJLqujgReMLNX3X07cKq7bzCzgUH5Gndf1KTBROKYBVBcXNwkiYmISNtE7WJ6MnnbzB4B/tTCYRXAkKTtwTRNKlcC97i7A+Vm9iGJta8Xu/uG4Nx/N7P5JLqsmiQIERGJR9S5mBobTuLJowN5CxhuZoVm1hMYT6I7KdlHwFgAMxsEHAWsNbODzaxPUH4wcA6wqo2xiohIG0S9B7GDht1DfyOxRkSz3L0mmJbjeRKPuT7o7qvNbEqwfyaJUdmzzWwliS6pm9z9EzMbBswPhl50B37r7lqgSESkA0XtYurTlsbdfQGwoFHZzKTXG0hcHTQ+bi0wqi3nFBGR1IjUxWRmFwWD2uq2+5nZ/4otKhERSbuoTzHd4e7z6zbcfauZ3QH8PpaoRERaKWfXFvLWLGixXrfd2wHYn9vyErA5u7ZAbo92x9ZVRU0QYVcaUY8VEYlVa5ZeLS/fkThm2KAItQdRWVkJNVvbFlgXF/VDfomZ/YTE1BkOlAJLY4tKRKQV4lzWderUqexZv7FNcXV1UR9zLQX2Ar8DHgOqScyjJCIiGSrqU0yfATfHHIuIiHQiUZ9iesHM+iVtH2pmz8cWlYiIpF3ULqYB7r61bsPdP0VrUouIZLSoCWK/mdVPrWFmBYTM7ioiIpkj6lNMtwKvmdkrwfaXCKbYFhGRzBT1JvVCMysmkRSWAU+TeJJJREQyVNTJ+q4GppKYsnsZcArwBg2XIJUYVVZW8tmOHO5+u+XRn1Gt35HDwZWVKWtPRDJL1HsQU4H/Aax39zOBE4DNsUUlIiJpF/UexG53321mmFkvd19jZkfFGpk0kJ+fz56ajdxy4vaUtXn3233plZ+fsvZEJLNETRAVwTiI35NY/vNTWl5yVEREurCoN6kvCl5OM7OXgUMALeAjIpLBWj0jq7u/0nItERHp6tq6JrWIiGQ4JQgREQmlBCEiIqGUIEREJJQShIiIhFKCEBGRUEoQIiISSglCRERCKUGIiEgoJQgREQnV6qk2RES6irKyMsrLyxuU1W1PnTq1Sf2ioiJKS0s7JLauQAlCRLJKXl5eukPoMpQgRCRj6WqgfWK9B2FmJWb2npmVm9nNIfsPMbNnzWy5ma02syujHisiIvGKLUGYWQ7wc+ArwDHAZWZ2TKNq1wHvuvso4Azg382sZ8RjRUQkRnFeQZwElLv7WnffCzwKXNiojgN9zMyA3sAWoCbisSIiEqM470HkAx8nbVcAJzeq8wDwDInlS/sAl7r7fjOLciwAZjYZmAwwdOjQ1ETeSX20M4e73+7bYr1NuxJ5f9BB+1tsb3hKIhORTBRngrCQMm+0fS6wDPgycCSJ9a5fjXhsotB9FjALoLi4OLROJigqKopcd2/wGF+vIw58zPBWtisi2SXOBFEBDEnaHkziSiHZlcA97u5AuZl9CBwd8dis0pqnMeqe754+fXpc4YhIFojzHsRbwHAzKzSznsB4Et1JyT4CxgKY2SDgKGBtxGNFRCRGsV1BuHuNmV0PPA/kAA+6+2ozmxLsnwn8AJhtZitJdCvd5O6fAIQdG1esIiIHkur7f3VtdvZ7gLEOlHP3BcCCRmUzk15vAM6JeqyISEeL4/4fdI17gBpJLSJyANl8/0+zuYqISCglCBERCaUEISIioZQgREQklBKEiIiEUoIQEZFQShAiIhJKCUJEREJpoJxkhX379lFRUcHu3bvTHUqHys3NZfDgwfTo0SPdoUgXpAQhWaGiooI+ffpQUFBAYn2qzOfuVFVVUVFRQWFhYbrDkS5IXUySFXbv3k3//v2zJjkAmBn9+/fPuqsmSR1dQXRhZWVllAeTgyWrK6ubF6ZOUVFRq+aVyTTZlBzqZON7ltRRgshAeXl56Q5BRDKAEkQXls1XA51B79692blzZ7P7161bxwUXXMCqVasit3nFFVdwwQUXMG7cuFSEKNIuugchIiKhlCBE2mnnzp2MHTuWE088keOPP56nn366fl9NTQ2TJk1i5MiRjBs3jl27dgGwdOlSTj/9dMaMGcO5557Lxo0b0xW+SLOUIETaKTc3l/nz5/P222/z8ssvc8MNN+DuALz33ntMnjyZFStW0LdvX37xi1+wb98+SktLeeKJJ1i6dClXXXUVt956a5rfhUhTugch0k7uzi233MKiRYvo1q0blZWVbNq0CYAhQ4Zw6qmnAnD55ZczY8YMSkpKWLVqFWeffTYAtbW1fOELX0hb/CLNUYIQaad58+axefNmli5dSo8ePSgoKKgfe9D4MVMzw9059thjeeONN9IRrkhk6mISaadt27YxcOBAevTowcsvv8z69evr93300Uf1ieCRRx7htNNO46ijjmLz5s315fv27WP16tVpiV3kQJQgRNppwoQJLFmyhOLiYubNm8fRRx9dv2/EiBHMmTOHkSNHsmXLFq699lp69uzJE088wU033cSoUaMYPXo0r7/+ehrfgUg4dTGJtFHdGIgBAwY021307rvvhpaPHj2aRYsWNSmfPXt2yuITaS9dQYiISCglCBERCaUEISIioZQgREQklBKEiIiEUoIQEZFQesxVstL13/0ef/9kS8raGzjgMB74yX2tPm7atGn07t2bG2+8MWWxiKRKrAnCzEqA6UAO8Gt3v6fR/u8BE5JiGQEc7u5bzGwdsAOoBWrcvTjOWCW7/P2TLXww6PTUNbjpldS1JdJJxNbFZGY5wM+BrwDHAJeZ2THJddz9Pncf7e6jgf8LvOLuyX/WnRnsV3KQjDB37lxGjhzJqFGj+PrXv95g3wcffEBJSQljxozhi1/8ImvWrAHg2Wef5eSTT+aEE07grLPOqp8IcNq0aVx11VWcccYZDBs2jBkzZnT4+5HMFuc9iJOAcndf6+57gUeBCw9Q/zLgkRjjEUmr1atXc9ddd/HSSy+xfPlypk+f3mD/5MmTKSsrY+nSpdx///1885vfBOC0007jzTff5J133mH8+PH8+Mc/rj9mzZo1PP/88yxevJg777yTffv2deh7kswWZxdTPvBx0nYFcHJYRTM7CCgBrk8qduCPZubAr9x9VlyBinSEl156iXHjxjFgwAAADjvssPp9O3fu5PXXX+eSSy6pL9uzZw8AFRUVXHrppWzcuJG9e/dSWFhYX+f888+nV69e9OrVi4EDB7Jp0yYGDx7cQe9IMl2cCcJCyryZul8F/rNR99Kp7r7BzAYCL5jZGndvMnmNmU0GJgMMHTq0vTGLxMbdm0z/XWf//v3069ePZcuWNdlXWlrKd7/7Xb72ta/x5z//mWnTptXv69WrV/3rnJwcampqUh22ZLE4u5gqgCFJ24OBDc3UHU+j7iV33xD8/3dgPokuqybcfZa7F7t78eGHH97uoEXiMnbsWB577DGqqqoA2LLl87+H+vbtS2FhIY8//jiQSCbLly8HEtOJ5+fnAzBnzpwOjlqyWZxXEG8Bw82sEKgkkQT+d+NKZnYIcDpweVLZwUA3d98RvD4H+H6MsUqWGTjgsJQ+eTRwwGEt1jn22GO59dZbOf3008nJyeGEE06goKCgfv+8efO49tpr+eEPf8i+ffsYP348o0aNYtq0aVxyySXk5+dzyimn8OGHH6YsbpEDsbq1c2Np3Ow84GckHnN90N3vMrMpAO4+M6hzBVDi7uOTjhtG4qoBEknst+5+V0vnKy4u9iVLlqT0PUhm+Otf/8qIESPSHUZaZPN772hTp04FaPIAQmdmZkube1I01nEQ7r4AWNCobGaj7dnA7EZla4FRccYmIiIHpqk2REQklBKEiIiEUoIQEZFQShAiIhJKCUJEREJpum/JSrfccD3bPtmUsvYOGTCIu//9gZS1B3D77bfzpS99ibPOOiul7YpEpQQhWWnbJ5u46cg1KWvv3g9S1lS9739fY0MlvdTFJNJB1q1bx4gRI7jmmms49thjOeecc6iurmbZsmWccsopjBw5kosuuohPP/0UgCuuuIInnngCgJtvvpljjjmGkSNHcuONN7Jjxw4KCwvrZ2/dvn07BQUFms1VUkoJQqQDvf/++1x33XWsXr2afv368eSTTzJx4kTuvfdeVqxYwfHHH8+dd97Z4JgtW7Ywf/58Vq9ezYoVK7jtttvo06cPZ5xxBn/4wx8AePTRR7n44ovp0aNHOt6WZCglCJEOVFhYyOjRowEYM2YMH3zwAVu3buX00xOr202aNIlFixpOWty3b19yc3O5+uqreeqppzjooIMAuPrqq3nooYcAeOihh7jyyis77o1IVlCCEOlAjafn3rp1a4vHdO/encWLF3PxxRfz+9//npKSEgBOPfVU1q1bxyuvvEJtbS3HHXdcXGFLllKCEEmjQw45hEMPPZRXX30VgIcffrj+aqLOzp072bZtG+eddx4/+9nPGqwZMXHiRC677DJdPUgs9BSTZKVDBgxK6ZNHhwwY1OZj58yZw5QpU9i1axfDhg2r7zaqs2PHDi688EJ2796Nu/PTn/60ft+ECRO47bbbuOyyy9p8fmm9srIyysvLm5TXldXN6lqnqKiI0tLSDoktlZQgJCulesxCFAUFBaxatap++8Ybb6x//eabbzapP3v27PrXixcvDm3ztddeY9y4cfTr1y9lcUrb5eXlpTuElFKCEOmiSktLee6551iwYEHLlSWluuLVQFsoQYh0UWVlZekOQTKcblJL1ohz9cTOKhvfs6SOEoRkhdzcXKqqqrLqA9PdqaqqIjc3N92hSBelLibJCoMHD6aiooLNmzenO5QOlZuby+DBg9MdhnRRShCSFXr06EFhYWG6wxDpUtTFJCIioZQgREQklBKEiIiEskx6qsPMNgPr0x1HTAYAn6Q7CGkzff+6tkz+/h3h7oeH7cioBJHJzGyJuxenOw5pG33/urZs/f6pi0lEREIpQYiISCgliK5jVroDkHbR969ry8rvn+5BiIhIKF1BiIhIKCUIEREJpQTRCmZWa2bLzGyVmT1rZv1aqP9nM2v3o3Fm9m0zOyhpe52ZrQxiWWZm/9zeczRz3tFmdl4cbWeCpJ+Hun83B+XdzexuM3s/ad+tScftDP4vMLNVjdqcZmY3Ih3CzNzMHk7a7m5mm83sP5LKvmJmS8zsr2a2xszuD8qPCn7HlwX7Mu4+hSbra51qdx8NYGZzgOuAuzrgvN8GfgPsSio7091bNXDHzLq7e00rDhkNFANasixc/c9DIz8E/gE43t13m1kf4IYOjUyi+gw4zszy3L0aOBuorNtpZscBDwDnu/saM+sOTA52zwB+6u5PB3WP79jQ46criLZ7A8iH+r+03zSzFWY238wOTap3uZm9Hlx1nBTUb/BXYrCvwMwONrM/mNnyoOxSM/sW8I/Ay2b2cnPBmNkRZvZiEMOLZjY0KJ9tZj8Jjr3XzI40s4VmttTMXjWzo4N6lwTnXG5mi8ysJ/B94NLgL6RLU/z1y0jBld41QKm77wZw9x3uPi2tgcmBPAecH7y+DHgkad//Ae5y9zUA7l7j7r8I9n0BqKir6O4rOyDWDqUE0QZmlgOMBZ4JiuYCN7n7SGAlcEdS9YPd/Z+BbwIPttB0CbDB3Ue5+3HAQnefAWwgccVwZlLdl4MP7r8E2w8Ac4MY5pH466bOPwFnufsNJB7XK3X3McCNQN0P++3Aue4+Cviau+8Nyn7n7qPd/XdRvjZZJq9RF9OlQBHwkbvviNjGkcltAFNii1aa8ygw3sxygZHAX5L2HQcsbea4nwIvmdlzZvadlrqcuyIliNbJC36Jq4DDgBfM7BCgn7u/EtSZA3wp6ZhHANx9EdC3hR+ilcBZZnavmX3R3bcdoO6ZwQf3ycH2/wR+G7x+GDgtqe7j7l5rZr2BfwYeD97Hr0j8FQTwn8BsM7sGyDnAeeVz1cH3YHRzSdTMrgw+/D82syEhbXyQ3AYwM/aopQF3XwEUkLh6iNyd6u4PASOAx4EzgDfNrFcMIaaNEkTr1PU5HwH0JHEPoiWNB5o4UEPDr30ugLv/FzCGRKL4kZnd3o5Yk8/7WfB/N2Brow+1EcG5pwC3AUOAZWbWvx3nzmblwNDgvgPu/lDwM7MNJd7O7Bngfhp2LwGsJvE7GcrdN7j7g+5+IYnf6+PiC7HjKUG0QfCX/bdIdNHsAj41sy8Gu78OvJJU/VIAMzsN2BYcuw44MSg/ESgMXv8jsMvdf0Pih/XEoI0dQJ8WwnodGB+8ngC8FhL3duBDM7skOJ+Z2ajg9ZHu/hd3v53ErJVDIp5Xkrj7LuD/Aw8EXRZ1XZI90xqYtORB4Psh9xHuA24xs38CMLNuZvbd4HWJmfUIXv8D0J+kG9yZQE8xtZG7v2Nmy0l8KE8CZgY3KNcCVyZV/dTMXgf6AlcFZU8CE4NunreA/wrKjwfuM7P9wD7g2qB8FvCcmW1sdB8i2beAB83se8DmRjEkmwD80sxuA3qQ6H9dHpx3OGDAi0HZR8DNQZw/0n2IJuq6HOssdPebgVuBHwCrzGwHUE2i63FDx4coUbh7BTA9pHyFmX0beCT4/XbgD8Huc4DpZrY72P6eu/+tI+LtKJpqQ0REQqmLSUREQilBiIhIKCUIEREJpQQhIiKhlCBERCSUHnMVaYGZTQN2uvv9KWrv9WD6FczsPuA8EiN4PyAxDmZuKs4j0l5KECIdrC45BP4NONzd97S2nTbMzivSKupiEmnEzCYGs+Iut6S1AoJ915jZW8G+J4PBU01mww3KjjWzxcFcTCuCgYjJ60E8AxwM/CWYubd+lt8DzLrbYHbeDvuiSFbSQDmRJGZ2LPAUcKq7f2Jmh5EYpb7T3e83s/7uXhXU/SGwyd3LzGwlUOLulWbWz923mlkZ8Ka7zwumT89x92oz2+nuvYM2kl9PSzrPi8AUd3/fzE4mMZL9y2Y2GxgAXOjutR36xZGsoy4mkYa+DDxRtxiTu28xs+T9xwWJoR/QG3g+KK+bDfcxEgkGEmuG3Gpmg4Gn3P39KAE0mnW3rjh5ltDHlRykI6iLSaQho+kMvMlmA9e7+/HAnXw+E2+T2XDd/bfA10jMxfS8mX05YgzNzrob+Ky5A0VSSQlCpKEXgX+tm+486GJK1gfYGMziOaGuMGw2XDMbBqwNFn16hsRiNC060Ky7Ih1JCUIkibuvJrHO+CvBbL0/aVTl/5FYcewFYE1S+X1mttLMVgGLSMyGeymJGV2XAUeTWHkwqgnAN4IYVgMXtuHtiLSLblKLiEgoXUGIiEgoJQgREQmlBCEiIqGUIEREJJQShIiIhFKCEBGRUEoQIiIS6r8BRF3tb41nkQYAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "sns.boxplot(data=df,hue='label',x='classifier',y=SCORING,width=.4);" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "os.remove(result_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: This is a simple example, not a replication study, and shouldn't be taken as such." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }